DAOS-16005 object: check resent coll_punch on leader and relay engine #14659
Ticket title is 'aurora soak stress: Pool connect issues/pool query appear to be hanging'
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14659/2/execution/node/1551/log
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14659/5/execution/node/1505/log
In the collective punch RPC handler on the leader or a relay engine, if the related DTX has already been prepared when a resent RPC is handled, we should avoid re-executing the punch locally. Otherwise, it may mistakenly clean up the previously prepared DTX entry. Signed-off-by: Fan Yong <fan.yong@intel.com>
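The check described above can be illustrated with a minimal, self-contained sketch. It is not the patch itself: every name here (`sketch_dtx`, `sketch_handle_coll_punch`, the state enum) is hypothetical and only models the idea that a resent RPC must not re-run the local punch once a DTX entry already exists for it.

```c
#include <stdbool.h>

/* Hypothetical DTX states for illustration only; not the DAOS definitions. */
enum sketch_dtx_state {
    SKETCH_DTX_NONE,      /* no entry exists for this transaction yet */
    SKETCH_DTX_PREPARED,  /* local modification done, awaiting commit */
    SKETCH_DTX_COMMITTED, /* transaction already committed */
};

struct sketch_dtx {
    enum sketch_dtx_state state;
    int                   exec_count; /* times the punch ran locally */
};

/* Stand-in for the local punch: applies the change and prepares the DTX. */
static void
sketch_punch_local(struct sketch_dtx *dtx)
{
    dtx->exec_count++;
    dtx->state = SKETCH_DTX_PREPARED;
}

/*
 * Model of the handler on a leader or relay engine.
 * Returns true if the punch was executed locally, false if it was skipped
 * because a resent RPC found an existing (prepared or committed) entry.
 */
static bool
sketch_handle_coll_punch(struct sketch_dtx *dtx, bool resent)
{
    if (resent && dtx->state != SKETCH_DTX_NONE)
        return false; /* keep the earlier entry untouched */

    sketch_punch_local(dtx);
    return true;
}
```

In this model a first delivery always executes, a resend that finds a prepared entry is a no-op, and a resend that finds no entry (the original request never arrived) still executes.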
Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14659/10/testReport/
test_dfuse_daos_build_wb failed because of DAOS-16215, which is not related to this patch.
    if (age >= DAOS_AGG_THRESHOLD)
        D_WARN("DTX "DF_DTI" (state:%u, age:%u) still references the data, "
               "cannot be (VOS) aggregated\n",
               DP_DTI(&DAE_XID(dae)), vos_dtx_status(dae), age);
It would be better to rate-limit this warning; otherwise, the log file will be flooded with it when some data fails to be DTX-committed for a long time.
The warning may be triggered for different DTX entries, so limiting the rate might drop information about some of them. I am not sure whether we have an efficient way to skip only the repeated log messages.
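One way to reconcile the two concerns, sketched here purely as an illustration, is to rate-limit per entry rather than globally: each entry remembers when it was last reported, so a new entry is always logged once while repeats for the same entry are suppressed for an interval. The struct, the function name, and the 60-second interval are all assumptions, and the sketch writes to `stderr` rather than using any DAOS logging macro.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed interval, in seconds, between warnings for the same entry. */
#define SKETCH_WARN_INTERVAL 60

/* Hypothetical stand-in for an active DTX entry. */
struct sketch_entry {
    uint64_t id;
    uint64_t last_warn; /* time of the last warning, 0 if never warned */
};

/*
 * Warn about a stale entry at most once per SKETCH_WARN_INTERVAL.
 * 'now' is a monotonic time in seconds and must be non-zero.
 * Returns true if a warning was emitted, false if it was suppressed.
 */
static bool
sketch_warn_stale(struct sketch_entry *e, uint64_t now, uint32_t age)
{
    if (e->last_warn != 0 && now - e->last_warn < SKETCH_WARN_INTERVAL)
        return false;

    e->last_warn = now;
    fprintf(stderr,
            "entry %llu (age:%u) still references the data, "
            "cannot be aggregated\n",
            (unsigned long long)e->id, age);
    return true;
}
```

The cost of this approach is one extra timestamp per entry; a global limiter needs no per-entry state but can hide entries that only become stale while the limiter is closed.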
…#14659) (#14826) For collective punch RPC handler on leader or relay engine, if related DTX has already been prepared when handling resent RPC, then we should avoid re-executing the punch locally. Otherwise, it may cleanup former prepared DTX entry by wrong. Signed-off-by: Fan Yong <fan.yong@intel.com>